Skip to content

Upgrade to text-embedding-3-large model as default, with vector storage optimizations #2470

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

pamelafox
Copy link
Collaborator

@pamelafox pamelafox commented Apr 1, 2025

Purpose

This pull request changes the default embedding model to text-embedding-3-large, with 3072 dimensions, along with these AI Search vector storage optimizations:

  • Truncate dimensions to 1024
  • Binary quantization
  • Preserve originals in the search index for rescoring
  • Don't store the vectors in the search index itself

See this notebook for a demonstration of the effects of those optimizations. Due to the rescoring, the search quality remains high.

This PR introduces a new environment variable AZURE_SEARCH_FIELD_NAME_EMBEDDING so that developers can theoretically have multiple fields in their index, for different embedding sizes/models.

This PR also changes the SKU for all models to GlobalStandard. It's becoming really tricky to find a region for the Standard SKU that works for all the models. Some developers may not be comfortable with GlobalStandard, depending on their regulations, so they can still change the SKU manually as desired.

Fixes #2383

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[X] Yes - I am trying to make it backwards compatible, but it's hard! I suspect that developers that recently deployed gpt-4o-mini with Standard sku will get an error, and need to run `azd env set` to change the deployment name or sku name.
[ ] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshot need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[X] No

Type of change

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

Copy link

github-actions bot commented Apr 2, 2025

Check Country Locale in URLs

We have automatically detected added country locale to URLs in your files.
Review and remove country-specific locale from URLs to resolve this issue.

Check the file paths and associated URLs inside them.
For more details, check our Contributing Guide.

File Full Path Issues
docs/deploy_features.md
#LinkLine Number
1https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=global-standard%2Cstandard-chat-completions#models-by-deployment-type197

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR upgrades the default embedding model to "text-embedding-3-large" (3072 dimensions) and implements several vector storage optimizations including truncation, binary quantization, and preserving original values for rescoring. It also introduces new environment variables for embedding field names and updates documentation and related code to support the new configuration.

  • Updated tests and environment variables for the new embedding model and dimensions.
  • Revised documentation to reflect model and deployment changes.
  • Refactored search management and approach modules to use dynamic embedding field names.

Reviewed Changes

Copilot reviewed 24 out of 27 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/conftest.py Updated mocked model and dimensions for the new embedding model.
docs/gpt4v.md Changed embedding model reference from ada to text-embedding-3-large.
docs/deploy_features.md Updated deployment instructions and embedding model references.
docs/deploy_existing.md Adjusted instructions for existing deployments to use the new model.
azure.yaml Added new environment variables for embedding field names.
app/backend/prepdocslib/searchmanager.py Refactored index creation to use dynamic embedding field names and profiles.
app/backend/integratedvectorizerstrategy.py Passed new search field names into indexer skill configuration.
app/backend/prepdocs.py Updated embedding field configuration from environment variables.
app/backend/approaches/* Modified constructors and vector field usages to accept new embedding field.
app/backend/app.py Integrated new environment variables for embedding field names in client setup.
.github/workflows/azure-dev.yml & .azdo/pipelines/azure-dev.yml Included new environment variable exports for embedding field names.
Files not reviewed (3)
  • app/backend/requirements.txt: Language not supported
  • infra/main.bicep: Language not supported
  • infra/main.parameters.json: Language not supported

Comment on lines +133 to 134
By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embeddig model, you can do so by following these steps:

Copy link
Preview

Copilot AI Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in the word 'embeddig'. Please change it to 'embedding'.

Suggested change
By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embeddig model, you can do so by following these steps:
By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embedding model, you can do so by following these steps:

Copilot is powered by AI, so mistakes are possible. Review output carefully before use.

@pamelafox pamelafox marked this pull request as ready for review April 3, 2025 00:09
"id": self.id,
"content": self.content,
# Should we rename to its actual field name in the index?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pending question as to whether to send it down using the field of the index?

@@ -317,7 +321,8 @@ class ExtraArgs(TypedDict, total=False):
**dimensions_args,
)
query_vector = embedding.data[0].embedding
return VectorizedQuery(vector=query_vector, k_nearest_neighbors=50, fields="embedding")
# TODO: use optimizations from rag time journey 3
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if field == "embedding"
else await self.compute_image_embedding(query_text)
await self.compute_image_embedding(query_text)
if field.startswith("image")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bit tricky, feels a bit code smelly.

if field == "embedding"
else await self.compute_image_embedding(q)
await self.compute_image_embedding(q)
if field.startswith("image")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same logic here

@@ -83,7 +85,7 @@ async def run(
minimum_reranker_score = overrides.get("minimum_reranker_score", 0.0)
filter = self.build_filter(overrides, auth_claims)

vector_fields = overrides.get("vector_fields", ["embedding"])
vector_fields = overrides.get("vector_fields", [self.embedding_field])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here, need to decide whether the frontend should render the actual field names.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue with Integrated vectorization enabled
1 participant